RUSSE: The First Workshop on Russian Semantic Similarity
نویسندگان
چکیده
The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference. There exist a lot of comparative studies on semantic similarity, yet no analysis of such measures was ever performed for the Russian language. Exploring this problem for the Russian language is even more interesting, because this language has features, such as rich morphology and free word order, which make it significantly different from English, German, and other wellstudied languages. We attempt to bridge this gap by proposing a shared task on the semantic similarity of Russian nouns. Our key contribution is an evaluation methodology based on four novel benchmark datasets for the Russian language. Our analysis of the 105 submissions from 19 teams reveals that successful approaches for English, such as distributional and skip-gram models, are directly applicable to Russian as well. On the one hand, the best results in the contest were obtained by sophisticated supervised models that combine evidence from different sources. On the other hand, completely unsupervised approaches, such as a skip-gram model estimated on a largescale corpus, were able score among the top 5 systems.
منابع مشابه
On the nature Of Semantic Similarity and it’S meaSuring with diStributiOnal SemanticS mOdelS
The paper describes our application of the distributional semantic model (DSM) method that we developed for The First International Workshop on Russian Semantic Similarity Evaluation (RUSSE) shared relatedness task. The model was trained, for the most part, on the data of the Russian National Corpus main subcorpus (around 200 mln tokens), and the resulting vector space was weighted according to...
متن کاملGathering Information About Word Similarity from Neighbor Sentences
In this paper we present the first results of detecting word semantic similarity on the Russian translations of Miller-Charles and Rubenstein-Goodenough sets prepared for the first Russian word semantic evaluation Russe-2015. The experiments were carried out on three text collections: Russian Wikipedia, a news collection, and their united collection. We found that the best results in detection ...
متن کاملOn the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian
The paper is focused on classes of lexical units which arise as a result of derivational processes – word formation and semantic transfers, acting either in isolation or together, on the basis of common semantic foundations that bind targets and sources of derivation. The lexical items which constitute the classes under study vary in their denotative characteristics and due to their categ...
متن کاملHuman and Machine Judgements for Russian Semantic Relatedness
Semantic relatedness of terms represents similarity of meaning by a numerical score. On the one hand, humans easily make judgements about semantic relatedness. On the other hand, this kind of information is useful in language processing systems. While semantic relatedness has been extensively studied for English using numerous language resources, such as associative norms, human judgements and ...
متن کاملTexts in, meaning out: neural language models in semantic similarity task for Russian
Distributed vector representations for natural language vocabulary get a lot of attention in contemporary computational linguistics. This paper summarizes the experience of applying neural network language models to the task of calculating semantic similarity for Russian. The experiments were performed in the course of Russian Semantic Similarity Evaluation track, where our models took from 2nd...
متن کامل